-
Federal court records have been available online for nearly a quarter century, yet they remain frustratingly inaccessible to the public. This is due to two primary barriers: (1) the federal government's prohibitively high fees to access the records at scale and (2) the unwieldy state of the records themselves, which are mostly text documents scattered across numerous systems. Official datasets produced by the judiciary, as well as third-party data collection efforts, are incomplete, inaccurate, and similarly inaccessible to the public. The result is a de facto data blackout that leaves an entire branch of the federal government shielded from empirical scrutiny. In this Essay, we introduce the SCALES project: a new data-gathering and data-organizing initiative to right this wrong. SCALES is an online platform that we built to assemble federal court records, systematically organize them and extract key information, and, most importantly, make them freely available to the public. The database currently covers all federal cases initiated in 2016 and 2017, and we intend to expand this coverage to all years. This Essay explains the shortcomings of existing systems (such as the federal government's PACER platform), how we built SCALES to overcome these inadequacies, and how anyone can use SCALES to empirically analyze the operations of the federal courts. We offer a series of exploratory findings to showcase the depth and breadth of the SCALES platform. Our goal is for SCALES to serve as a public resource where practitioners, policymakers, and scholars can conduct empirical legal research and improve the operations of the federal courts. For more information, visit www.scales-okn.org.
-
The docket sheet of a court case contains a wealth of information about the progression of a case, the parties’ and judge’s decision-making along the way, and the case’s ultimate outcome, all of which can be used in analytical applications. However, the unstructured text of the docket sheet and the terse and variable phrasing of docket entries require the development of new models to identify key entities and enable analysis at a systematic level. We developed a judge entity recognition language model and disambiguation pipeline for US District Court records. Our model can robustly identify mentions of judicial entities in free text (~99% F1 score) and outperforms general state-of-the-art language models by 13%. Our disambiguation pipeline is able to robustly identify both appointed and non-appointed judicial actors and correctly infer the type of appointment (~99% precision). Lastly, we show with a case study on in forma pauperis decision-making that there is substantial error (~30%) in attributing decision outcomes to judicial actors if the free text of the docket is not used to make the identification and attribution.
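For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how precision, recall, and F1 are commonly computed for entity mentions under exact span matching. It is not the authors' evaluation code: the function name `span_prf`, the `(start, end)` offset representation, and the sample spans are all assumptions made for illustration.

```python
from typing import Set, Tuple

# Hypothetical representation: a mention is a (start, end) character span.
Span = Tuple[int, int]


def span_prf(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact span matching."""
    true_pos = len(gold & predicted)  # spans found in both sets
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative data only: three annotated mentions, three predictions,
# two of which match an annotation exactly.
gold = {(0, 11), (25, 40), (52, 60)}
predicted = {(0, 11), (25, 40), (70, 80)}
p, r, f = span_prf(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Exact matching is the strictest common choice; an evaluation that credits partially overlapping spans would report higher scores on the same predictions.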
